Voice activity detection algorithms using subband power distance feature for noisy environments

Authors

  • Tuan Van Pham
  • Michael Stadtschnitzer
  • Franz Pernkopf
  • Gernot Kubin
Abstract

In this paper, we propose two robust voice activity detection (VAD) methods for adverse environments. A single subband power distance (SPD) feature is estimated from different wavelet subbands and further improved to be robust against noise. The first method is based on a neural network whose input vector consists of the SPD feature and its first and second derivatives. The second method is an adaptive threshold-based algorithm that employs only the single SPD feature. A statistical percentile filter based on long-term information is enhanced to estimate the noise threshold more adaptively. The proposed methods are evaluated and compared with state-of-the-art VAD algorithms on the TIMIT database, which was artificially distorted by different additive noise types. The results show that the proposed VAD methods are very robust to environmental noise and mostly outperform standard VADs such as ETSI AFE ES 202 050 and ITU-T G.729 B.
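The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how the SPD feature is computed from the wavelet subbands, so the input here is assumed to be any one-dimensional per-frame feature, and the function names, the window length, the percentile, and the margin are all hypothetical choices made for illustration.

```python
import numpy as np


def stack_with_derivatives(feature):
    """Build an (n_frames, 3) input matrix: the feature plus two derivatives.

    The derivatives are approximated with finite differences (np.gradient);
    the paper may estimate them differently.
    """
    f = np.asarray(feature, dtype=float)
    d1 = np.gradient(f)    # first derivative (finite differences)
    d2 = np.gradient(d1)   # second derivative
    return np.stack([f, d1, d2], axis=1)


def percentile_threshold_vad(feature, window=50, percentile=20.0, margin=1.0):
    """Flag a frame as speech when the feature exceeds noise floor + margin.

    The noise floor is a low percentile of the feature over the most recent
    `window` frames, a simple stand-in for a long-term percentile filter.
    All three parameters are assumed values, not taken from the paper.
    """
    f = np.asarray(feature, dtype=float)
    decisions = np.zeros(f.shape[0], dtype=bool)
    for t in range(f.shape[0]):
        start = max(0, t - window + 1)
        noise_floor = np.percentile(f[start:t + 1], percentile)
        decisions[t] = f[t] > noise_floor + margin
    return decisions


if __name__ == "__main__":
    # Synthetic feature: silence, a burst of "speech", then silence again.
    feat = np.concatenate([np.zeros(30), np.full(20, 5.0), np.zeros(30)])
    speech = percentile_threshold_vad(feat)
    print("speech frames:", np.flatnonzero(speech))
    print("network input shape:", stack_with_derivatives(feat).shape)
```

A low percentile tracks the noise floor only as long as speech does not fill most of the window, which is one reason a long-term window is the natural choice here.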

Similar resources

Improved voice activity detection combining noise reduction and subband divergence measures

Currently, new trends in wireless communications are demanding reliable human-machine interaction in real-life environments. However, there are obstacles that inhibit automatic speech recognition (ASR) systems from working in noisy environments. The main difficulty is the degradation suffered by ASR systems due to a mismatch between training and test conditions. This paper shows an improved voice acti...

A New Algorithm for Voice Activity Detection Based on Wavelet Packets (RESEARCH NOTE)

Speech constitutes much of the communicated information; most other perceived audio signals do not carry nearly as much information. Indeed, many of the non-speech signals may be classified as ‘noise’ in human communication. The process of separating conversational speech and noise is termed voice activity detection (VAD). This paper describes a new approach to VAD which is based on the Wavelet ...

C-Means Clustering Applied to Speech Discrimination

An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The proposed speech/pause discrimination method is based on a hard-decision clustering approach built over a set of subband log-energies. Detecting the presence of speech frames (a new cluster) is achieved using a basic sequential algorithm scheme (BSAS) according...

Voice activity detection in noisy environments

The subject of this paper is robust voice activity detection (VAD) in noisy environments, especially in car environments. We present a comparison between several frame-based VAD feature extraction algorithms in combination with different classifiers. Experiments are carried out under equal test conditions using clean speech, clean speech with added car noise, and speech recorded in car environme...

Hard C-means clustering for voice activity detection

An effective voice activity detection (VAD) algorithm is proposed for improving speech recognition performance in noisy environments. The proposed speech/pause discrimination method is based on a hard-decision clustering approach built on a set of subband log-energies and noise prototypes that define a cluster. Detecting the presence of speech (a new cluster) is achieved using a basic sequentia...

Publication date: 2008